Morphology Matters: A Multilingual Language Modeling Analysis
Abstract
Prior studies in multilingual language modeling (e.g., Cotterell et al., 2018; Mielke et al., 2019) disagree on whether or not inflectional morphology makes languages harder to model. We attempt to resolve the disagreement and extend those studies. We compile a larger corpus of 145 Bible translations in 92 languages and a larger number of typological features. We fill in missing typological data for several languages and consider corpus-based measures of morphological complexity in addition to expert-produced typological features. We find that several morphological measures are significantly associated with higher surprisal when LSTM models are trained with BPE-segmented data. We also investigate linguistically motivated subword segmentation strategies like Morfessor and Finite-State Transducers (FSTs) and find that these segmentation strategies yield better performance and reduce the impact of a language's morphology on language modeling.
Similar resources
Judgment Language Matters: Multilingual Vector Space Models for Judgment Language Aware Lexical Semantics
A common evaluation practice in the vector space modeling (VSM) literature is to measure models’ ability to predict human judgments about lexical semantic relations between word pairs. Most existing evaluation sets, however, consist of scores collected for English word pairs only, ignoring the potential impact of the judgment language in which word pairs are presented on the human scores. In th...
Learning Multilingual Morphology with CLOG
The paper presents the decision list learning system Clog and the results of using it to learn nominal inflections of English, Romanian, Czech, Slovene, and Estonian. The dataset used to induce rules for the synthesis and analysis of the inflectional paradigms of nouns and adjectives of these languages is the Multext-East multilingual tagged corpus. The ILP system Foidl is also applied to the sa...
Immune synapses: mitochondrial morphology matters.
Proper positioning of mitochondria is critical for cellular function. Mitochondria localization close to synapses regulates signalling at neuronal and immune synapses (ISs). Vice versa, synapses influence activity, motility and the fusion/fission balance of close-by mitochondria. In this issue of The EMBO Journal, Baixauli et al (2011) identify a role for the mitochondrial fission factor dynami...
Multilingual Natural Language Processing
With rapidly growing online resources, such as Wikipedia, Twitter, or Facebook, there is an increasing number of languages that have a Web presence, and correspondingly there is a growing need for effective solutions for multilingual natural language processing. In this talk, I will explore the hypothesis that a multilingual representation can enrich the feature space for natural language proce...
Journal
Journal title: Transactions of the Association for Computational Linguistics
Year: 2021
ISSN: 2307-387X
DOI: https://doi.org/10.1162/tacl_a_00365